Optimality Conditions for Equality Constraints
MATH008 Lesson 10
00:00
Imagine a physical system, like a hanging chain, seeking its lowest energy state. If the endpoints are fixed, the system cannot move freely. Optimality is reached when the internal forces (the gradient of potential energy) are perfectly balanced by the tension forces exerted by the constraints. In convex optimization, this balance is captured by the KKT conditions for equality constraints.

The Geometry of Feasibility

For a convex problem with equality constraints $Ax=b$, a vector $v \in \mathbb{R}^n$ is a feasible direction if $Av = 0$. This means that moving in direction $v$ maintains the constraint: $A(x+v) = Ax + Av = b$ whenever $Ax=b$.

For $x^*$ to be optimal, the directional derivative $\nabla f(x^*)^T v$ must be nonnegative for all feasible directions $v$. But $\mathcal{N}(A)$ is a subspace, so whenever $v$ is feasible, $-v$ is too; the only way both derivatives can be nonnegative is $\nabla f(x^*)^T v = 0$. Thus the gradient $\nabla f(x^*)$ must be orthogonal to $\mathcal{N}(A)$, i.e., it lies in the range of $A^T$, leading us to the existence of Lagrange multipliers.
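A quick numerical sketch of both facts, using NumPy on the toy problem of minimizing $f(x) = \tfrac{1}{2}\|x\|_2^2$ subject to $Ax=b$ (whose minimizer is the minimum-norm solution; the random matrix and seed here are illustrative choices, not from the lesson):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(2, 4))   # p = 2 constraints in R^4
b = rng.normal(size=2)

# Minimizer of f(x) = 0.5 * ||x||^2 subject to Ax = b (minimum-norm solution)
x_star = A.T @ np.linalg.solve(A @ A.T, b)

# Basis for the nullspace N(A) via SVD: feasible directions v satisfy Av = 0
_, _, Vt = np.linalg.svd(A)
N = Vt[2:].T                  # columns span N(A)

# Moving along any feasible direction preserves Ax = b ...
v = N @ rng.normal(size=2)
assert np.allclose(A @ (x_star + v), b)

# ... and at the optimum the gradient (here grad f(x) = x) is
# orthogonal to N(A), up to round-off
print(np.abs(N.T @ x_star).max())
```

The printed residual is at round-off level, confirming that $\nabla f(x^*)$ has no component along any feasible direction.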

The Optimality Condition

A feasible point $x^*$ (i.e., one satisfying $Ax^* = b$) is optimal if and only if there exists a vector $\nu^* \in \mathbb{R}^p$ such that:

$\nabla f(x^*) + A^T \nu^* = 0$

Together with feasibility $Ax^* = b$, this gives $n + p$ equations in the $n + p$ unknowns $(x^*, \nu^*)$, characterizing the equilibrium between the objective's descent direction and the forces exerted by the constraints.
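For a quadratic objective $f(x) = \tfrac{1}{2}x^T P x + q^T x$ these equations are linear and can be solved directly; here is a small sketch (the problem data are randomly generated for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 5, 2
M = rng.normal(size=(n, n))
P = M @ M.T + n * np.eye(n)        # positive definite Hessian
q = rng.normal(size=n)
A = rng.normal(size=(p, n))
b = rng.normal(size=p)

# Stationarity P x + q + A^T nu = 0 and feasibility A x = b, as one system:
#   [P  A^T] [x ]   [-q]
#   [A   0 ] [nu] = [ b]
K = np.block([[P, A.T], [A, np.zeros((p, p))]])
sol = np.linalg.solve(K, np.concatenate([-q, b]))
x_star, nu_star = sol[:n], sol[n:]

grad = P @ x_star + q                          # gradient of f at x*
print(np.linalg.norm(grad + A.T @ nu_star))    # stationarity residual, ~0
print(np.linalg.norm(A @ x_star - b))          # feasibility residual, ~0
```

Both residuals vanish to machine precision: the computed pair $(x^*, \nu^*)$ satisfies $\nabla f(x^*) + A^T \nu^* = 0$ and $Ax^* = b$ simultaneously.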

The Projected Gradient

The Euclidean projection of the negative gradient $-\nabla f(x)$ onto the nullspace $\mathcal{N}(A)$ is given by:

$\Delta x_{\text{pg}} = \text{argmin}_{Au=0} \| -\nabla f(x) - u \|_2$

This vector represents the "best" feasible descent direction. Crucially, $x$ is optimal if and only if $\Delta x_{\text{pg}} = 0$.
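Since $\mathcal{N}(A)$ is a subspace, this projection can be computed in closed form with the orthogonal projector $P_{\mathcal{N}} = I - A^T (A A^T)^{-1} A$ (assuming $A$ has full row rank). A minimal sketch with illustrative random data:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(2, 5))
x = rng.normal(size=5)
grad = 2 * x                       # gradient of the example objective ||x||^2

# Orthogonal projector onto N(A): P_N = I - A^T (A A^T)^{-1} A
P_N = np.eye(5) - A.T @ np.linalg.solve(A @ A.T, A)
dx_pg = P_N @ (-grad)              # projected negative gradient

assert np.allclose(A @ dx_pg, 0)   # a feasible direction: A * dx_pg = 0
print(grad @ dx_pg)                # <= 0: a descent direction (0 only at optimality)
```

The inner product $\nabla f(x)^T \Delta x_{\text{pg}} = -\nabla f(x)^T P_{\mathcal{N}} \nabla f(x) \le 0$, which is exactly why $\Delta x_{\text{pg}} = 0$ certifies optimality.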

The KKT System and Matrix Properties

We can compute the projected gradient $v = \Delta x_{\text{pg}}$ and an associated multiplier estimate $w$ simultaneously using the block system (replacing $I$ with the Hessian $\nabla^2 f(x)$ yields the Newton step instead):

$$\begin{bmatrix} I & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} v \\ w \end{bmatrix} = \begin{bmatrix} -\nabla f(x) \\ 0 \end{bmatrix}$$

Note on Matrix Spectral Properties: The KKT matrix is symmetric but indefinite. If $A$ has full row rank $p$, the matrix is nonsingular and has exactly $n$ positive and $p$ negative eigenvalues. This indefiniteness reflects the fact that the optimal point $(x^*, \nu^*)$ is a saddle point of the Lagrangian $L(x, \nu) = f(x) + \nu^T(Ax-b)$, rather than a simple local minimum in the combined primal-dual space.
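The stated inertia ($n$ positive, $p$ negative eigenvalues) can be checked numerically; this sketch uses a random full-row-rank $A$ for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 4, 2
A = rng.normal(size=(p, n))        # generically full row rank

# KKT matrix with identity (1,1) block
K = np.block([[np.eye(n), A.T], [A, np.zeros((p, p))]])

eig = np.linalg.eigvalsh(K)        # symmetric matrix -> real eigenvalues
print((eig > 0).sum(), (eig < 0).sum())   # expect n positive, p negative
```

By Sylvester's law of inertia, the positive definite $(1,1)$ block contributes $n$ positive eigenvalues and the Schur complement $-AA^T$ contributes $p$ negative ones, matching the count above.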

🎯 Core Principle
Optimality in equality-constrained problems requires the gradient to be orthogonal to the constraint's nullspace. This allows us to interpret the Newton step as the solution to a linearized approximation of the KKT conditions.